(19) Europäisches Patentamt
     European Patent Office
     Office européen des brevets

(11) EP 0 902 379 A2

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 17.03.1999 Bulletin 1999/11

(21) Application number: 98307334.7

(22) Date of filing: 10.09.1998

(51) Int. Cl.: G06F 17/24, G06F 17/21, G06F 17/27

(84) Designated Contracting States:
     AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE
     Designated Extension States:
     AL LT LV MK RO SI

(30) Priority: 15.09.1997 US 929427

(71) Applicant: XEROX CORPORATION
     Rochester, New York 14644 (US)

(72) Inventors:
     • Schilit, William N.
       Palo Alto, California 94304 (US)
     • Price, Morgan M
       Palo Alto, California 94306 (US)
     • Golovchinsky, Gene
       Palo Alto, California 94306 (US)
     • Weber, Mark D.
       Palo Alto, California 94301 (US)

(74) Representative: Skene James, Robert Edmund
     GILL JENNINGS & EVERY
     Broadgate House
     7 Eldon Street
     London EC2M 7LH (GB)

(54) A method and system for organizing documents based upon annotations in context 



(57) A document organizing system extracts annotations made to a document along with the context surrounding each annotation and organizes the annotations based upon the annotation attributes and/or context. The annotations are created by grouping marks based upon their proximity in time and space. The document is segmented to determine a minimum context associated with each annotation. A list of the annotations sorted by the attributes is then displayed to the user. The context provided by the invention for each annotation allows the user to fully understand the annotation.

Description

[0001] This invention is directed to a document organizing system. In particular, this invention is directed to a method and a system for organizing documents based upon the context of annotations made to those documents.

[0002] When people read paper documents, they often make annotations to highlight interesting or controversial passages and to record their reactions. Common annotations include margin notes, vertical bars, stars, circles, underlines, highlights, etc. Two advantages of annotating directly on the page are its low overhead and convenience. One disadvantage is that the recorded information is hidden and inaccessible until the reader returns to the specific page in the specific document.
[0003] To avoid this problem, some readers use a separate reading notebook to record their annotations. A reading notebook is useful because it provides a separate summary of what the user has read along with any commentary. The advantage of a reading notebook is that it permits a quick review of the material because it generally has less information to browse and search than the original document. One disadvantage of a reading notebook, however, is that the reader must recreate the context for each note to fully understand the meaning of each note.

[0004] Readers also use note cards to organize notes. The advantage of a note card system is that the cards can be easily reorganized. However, as with a reading notebook, unless the reader recreates it, there is no context available to permit the user to fully understand the notes. Additionally, each note must be categorized onto the correct card.
[0005] Handwritten notes and keywords are used in a system known as "Marquee" to index video. This system is described in "Marquee: A Tool for Real-Time Video Logging", K. Weber et al., Proceedings of CHI '94, April 1994, pp. 58-64, incorporated herein by reference in its entirety. In "Marquee", notes are synchronized to a video stream with time zones that are created with horizontal line gestures. Keywords are identified by the user by circling the words in the notes that the user has selected as keywords. The keywords are assigned to the time zone in which the keyword is created. Keywords also may be assigned directly by the user by typing the keyword in manually. Because the keywords are associated with time, the user can view an index of time zones and go directly to the video by selecting a time zone from that index. Although "Marquee" uses annotations to index a video document, it does not combine the annotations with the document in a visual way. "Marquee" is thus analogous to notetaking in a separate notebook rather than on the document itself.
[0006] "Dynomite" is a free-form digital "ink" notebook. The digital ink notebook is a pen-based computer that the user controls by writing with a pen directly on the screen of the computer. The computer senses the location and the positions traversed as the pen moves across the display and assigns ink marks that correspond with the positions of the pen. These ink marks are called digital ink because the ink is described by the computer digitally. Dynomite extracts the ink, assigns properties to each ink mark and can present a list of the ink marks sorted by the assigned properties. This list is known as an ink index. This system is described in co-assigned and copending EP Patent Application No. 93302127.0 entitled "System for Capturing and Retrieving Audio Data and Corresponding Handwritten Notes", and "Dynomite: A Dynamically Organized Ink and Audio Notebook", by L. Wilcox et al., in CHI 97 Conference Proceedings, ACM Press, 1997, pp. 186-193, incorporated herein by reference in their entireties. This ink index shows a type of the "ink" along with a time stamp and provides links to the original notebook pages. Dynomite's ink index provides "ink" marks linked to the corresponding full notebook page. However, Dynomite organizes only the ink notes themselves and not the associated information.

[0007] "ComMentor" is a platform for shared annotations that attaches text-based comments to locations within web documents. This system is described in "Shared Web Annotations as a Platform for Third-Party Value-Added Information Providers: Architecture, Protocols, and Usage Examples", by M. Roscheisen et al., Technical Report STAN-CS-TR-97-1582, Stanford Integrated Digital Library Project, Computer Science Department, Stanford University, November 1994, Updated April 1995, incorporated herein by reference in its entirety. Annotations are grouped into sets. A user can filter these sets and tour through documents within a set. A tour window shows a list of annotations, each shown with the document title of the annotated document and a number of annotation attributes. Clicking on the annotation causes the display to jump to the source document at the position of the annotation. ComMentor uses filtered annotations to produce lists of read documents, but does not support paper-like annotations or present lists of annotations in context.
[0008] Classroom 2000 is a system for capturing a lecture using recorded audio, prepared visual materials and handwritten notes made on a display overlay of viewgraphs. This system is described in "Classroom 2000: Enhancing Classroom Interaction and Review", by G. Abowd et al., in Proceedings of CSCW '96, March 1996, incorporated herein by reference in its entirety. Searching the text in the viewgraphs retrieves the viewgraphs along with the overlaid notes.
[0009] The Freestyle system, which was developed at Wang Laboratories, is a mechanism for sketching and writing on screen snapshots or on sheets of electronic paper. Freestyle records cursor movement and audio as well as the handwriting. This system is described in "Rapid Integrated Design of a Multimedia Communication System", in Human-Computer Interface Design, E. Francik, Marianne Rudisill et al. (editor), Morgan Kaufmann Publishers, Inc., 1996, incorporated herein by reference in its entirety. The result is a dynamic multimedia message that can be mailed to others. Freestyle does not provide the ability to organize the handwritten annotations.

[0010] The PENPOINT operating system for pen-based computers recognizes pen gestures for editing and allows arbitrary "ink" marks to be placed on top of any document using an "acetate layer". This system is described in "The Power of PENPOINT", by R. Carr et al., Addison-Wesley, Inc., 1991, incorporated herein by reference in its entirety. Although both Freestyle and PENPOINT support free-form document annotation, neither provides any way to retrieve documents based upon those annotations.
[0011] In 1945, Vannevar Bush described a mesh of trails running through a mechanized private file and library, or "memex", in "As We May Think", in Atlantic Monthly, July 1945, pp. 101-108, incorporated herein by reference in its entirety. These trails were produced as part of the reading activity, and provided a way to create and share personal organizations of information. Bush's visions were seminal in the development of hypermedia systems such as Engelbart's NLS and the World Wide Web. However, hypermedia systems have focused on sharing, browsing and more explicit authoring of links, not on personal organization and annotation.
[0012] Thus, an annotation system for electronic documents is needed that combines the advantages of marking directly on a document with the quick accessibility and flexible organization of marking on note cards or in a notebook.

[0013] This invention provides a system and method for using digital "ink" for annotations in context to organize a reader's activities. The system and method of this invention extract the contents surrounding and underlying a reader's annotations and present this information to the reader with links to the full context. The annotations in context provided by the system and method of this invention permit flexible organization of material without adding to the effort of reading and notetaking.

[0014] These and other features and advantages of this invention are described in or are apparent from the following detailed description of the preferred embodiments.

[0015] The preferred embodiments of this invention will be described in detail, with reference to the following figures, wherein:

Fig. 1 is a block diagram of the document organizing system of this invention;

Fig. 2 is a flow chart outlining the control routine of one embodiment of this invention;

Fig. 3 shows a document annotated according to this invention;

Fig. 4 shows the annotated portions of the document of Fig. 3;

Fig. 5 shows another view of the annotated document of Fig. 3; and

Fig. 6 is a flow chart outlining the annotation control routine of one embodiment of this invention.

[0016] Fig. 1 is a block diagram of one embodiment of the electronic document organizing system 10 of this invention. The system 10 has a processor 12 communicating with a display 14, a first storage device 16, a second storage device 18 and an input/output interface 20. The first storage device 16 stores a document 22 displayable on the display 14. The input/output interface 20 communicates with any number of conventional input/output devices 24 such as a mouse 26, a keyboard 28 and/or a pen-based device 30. A user manipulates the input/output devices 24 to annotate the document 22 when displayed on the display 14. The system 10 then stores these annotations 32 in the second storage device 18.

[0017] As shown in Fig. 1, the system 10 is preferably implemented using a programmed general purpose computer. However, the system 10 can also be implemented using a special purpose computer, a programmed microprocessor or microcontroller and any necessary peripheral integrated circuit elements, an ASIC or other integrated circuit, a hardwired electronic or logic circuit such as a discrete element circuit, a programmable logic device such as a PLD, PLA, FPGA or PAL, or the like. In general, any device that supports a finite state machine capable of implementing the flowchart shown in Fig. 2 can be used to implement the system 10.
[0018] Additionally, as shown in Fig. 1, the memories 16 and 18 are preferably implemented using static or dynamic RAM. However, the memories 16 and 18 can also be implemented using a floppy disk and disk drive, a writable optical disk and disk drive, a hard drive, flash memory or the like. Additionally, it should be appreciated that the memories 16 and 18 can be either distinct portions of a single memory or physically distinct memories.

[0019] Furthermore, it should be appreciated that the link 17 connecting the memory 16 to the processor 12 can be a wired or wireless link to a network (not shown). The network can be a local area network, a wide area network, an intranet, the Internet or any other distributed processing and storage network. In this case, the electronic document 22 is pulled from a physically remote memory 16 through the link 17 for processing in the processor 12 according to the method outlined below. In this case, the electronic document 22 can be stored locally in a portion of the memory 18 or some other memory (not shown) of the system 10.
[0020] The method of this invention includes three distinct processes. First, the reader makes annotations on a displayed document, and the annotations are extracted along with their context. Second, the system associates a number of attributes with the annotations in order to facilitate retrieval of the annotations and/or the underlying annotated documents. Third, the reader views collections of the annotations in context, where the collections are organized by those attributes.
[0021] The system 10 records annotations on electronic documents. A preferred interface for entering the annotations is a pen-based computer, where the reader "writes" directly on the electronic document. On a desktop computer without a pen, clicking a mouse in a margin might create a text overlay box to create the annotation. The system 10 may also support a number of different styles of marking. For example, these styles can include swiping with a highlighter pen, underlining text, vertical bars in the margin, circled regions, and margin notes.
[0022] Fig. 2 is a flow chart outlining a control routine of one embodiment of the invention. The control routine starts at step S100 and proceeds to step S110, where the user marks on the display of the document with digital ink to annotate it. The control routine then proceeds to step S120, where the system groups the marks of the digital ink by time and/or space into collected marks, treated as a single annotation, as will be described in more detail below. Next, the control routine proceeds to step S130, where the system determines the minimum context for each annotation. The system has a minimum context that determines how much of the document that surrounds the annotation is to be associated with the annotation. The minimum context may be predetermined as a user preference to be a few words, a sentence, a paragraph or any other amount in accordance with the user's preferences. The minimum context can be displayed to the user as a bounding box around the minimum context. The bounding box encloses the bounding region, and the minimum context is defined as the content enclosed within the bounding region of the corresponding annotation. Segmentation procedures are applied to the document to divide it into graphical components, e.g., lines of text, sentences, paragraphs and figures. Given the minimum context, the control routine expands the context to include all of the nearby segments. With this procedure, the context may include a couple of lines, the surrounding sentence, or the entire surrounding paragraph. Fig. 5 shows a bounding box 34 with the context around a circle annotation 3a.
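The context determination described for step S130 can be illustrated with a minimal sketch. It assumes that segmentation has already produced line segments with rectangular bounding boxes, that the minimum context is the set of segments touching the annotation's bounding region, and that "nearby" means within a fixed margin of that region. The names `Box`, `Segment`, `minimum_context`, `expanded_context` and the `margin` parameter are hypothetical and do not appear in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding region in page coordinates (assumed units)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other):
        # Two boxes intersect when they overlap on both axes.
        return (self.x0 <= other.x1 and other.x0 <= self.x1 and
                self.y0 <= other.y1 and other.y0 <= self.y1)

    def grow(self, margin):
        # Enlarge the box by `margin` on every side.
        return Box(self.x0 - margin, self.y0 - margin,
                   self.x1 + margin, self.y1 + margin)


@dataclass(frozen=True)
class Segment:
    """One graphical component produced by segmentation, e.g. a line of text."""
    text: str
    box: Box


def minimum_context(annotation_box, segments):
    """Segments enclosed by or touching the annotation's bounding region."""
    return [s for s in segments if s.box.intersects(annotation_box)]


def expanded_context(annotation_box, segments, margin=5.0):
    """Minimum context plus nearby segments (assumed: within `margin`)."""
    grown = annotation_box.grow(margin)
    return [s for s in segments if s.box.intersects(grown)]


lines = [
    Segment("line 1", Box(0, 0, 100, 10)),
    Segment("line 2", Box(0, 12, 100, 22)),
    Segment("line 3", Box(0, 24, 100, 34)),
    Segment("line 4", Box(0, 36, 100, 46)),
]
circle = Box(40, 13, 60, 21)  # a circle drawn around a word on line 2

print([s.text for s in minimum_context(circle, lines)])   # ['line 2']
print([s.text for s in expanded_context(circle, lines)])  # ['line 1', 'line 2', 'line 3']
```

A fixed margin is only one possible reading of "nearby segments"; expanding to the enclosing sentence or paragraph would follow the same pattern with coarser segments.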
[0023] The annotation control routine is shown in Fig. 6. The control routine starts at step S200 and proceeds to step S210, where the user selects and opens an electronic document. The user then starts marking on the document at step S220 and creates digital ink. The system then determines at step S230 if the new ink is close enough in time and space to be associated with previous ink marks. The system has time and space thresholds that may be predetermined or adjusted in accordance with a user's preferences. If the system determines at step S230 that the ink marks are not separate, the system proceeds to step S240, where the user continues to mark. As each mark is entered by the user, steps S230 and S240 are repeated until the system determines that the new ink is separated enough by time and space to proceed to step S250. At step S250 the ink marks are grouped together as a single annotation, and at step S260 the context for the annotation is determined and the attributes are assigned to the annotation. The control routine then proceeds to step S270, where the system determines if a new mark has been input. If a new mark has been input, the control routine returns to step S230. If no new mark is entered, then at step S280 the annotations are organized and displayed. The control routine then stops at step S290.
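The grouping decision of steps S230 to S250 can be sketched as follows. This is an illustration under stated assumptions, not the routine of Fig. 6 itself: each mark is reduced to a single point and a timestamp, a new mark joins the current annotation only when it is within both thresholds of the previous mark, and the names `Mark`, `group_marks`, `max_gap_s` and `max_dist` and their default values are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Mark:
    x: float  # pen position on the page (assumed units: pixels)
    y: float
    t: float  # time the mark was made, in seconds


def group_marks(marks, max_gap_s=2.0, max_dist=50.0):
    """Group marks into annotations using time and space thresholds.

    A mark is added to the current annotation when it is close to the
    previous mark in both time and space; otherwise the current annotation
    is closed and a new one is started.
    """
    annotations = []
    current = []
    for mark in sorted(marks, key=lambda m: m.t):
        if current:
            prev = current[-1]
            close_in_time = (mark.t - prev.t) <= max_gap_s
            close_in_space = hypot(mark.x - prev.x, mark.y - prev.y) <= max_dist
            if not (close_in_time and close_in_space):
                annotations.append(current)  # the finished annotation
                current = []
        current.append(mark)
    if current:
        annotations.append(current)
    return annotations


marks = [
    Mark(10, 10, 0.0), Mark(20, 12, 0.5),      # two strokes of one note
    Mark(300, 400, 0.9), Mark(305, 402, 1.2),  # far away on the page
    Mark(12, 11, 30.0),                        # much later
]
print([len(group) for group in group_marks(marks)])  # [2, 2, 1]
```

Both thresholds are parameters because the description allows them to be predetermined or adjusted to the user's preferences.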
[0024] For some special annotation formats such as those shown in Fig. 5, the control routine determines the context slightly differently. For margin bars 36 and other notes in the margin 38, the system ignores the horizontal distance when finding nearby segments. Thus, all vertically adjacent material is included in the contexts 40 and 42, respectively. For the line callouts and circle callouts, the control routine determines the minimum contexts from the underlined or circled text, etc., ignoring the ink in the callout gesture.
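The margin-mark rule can be sketched as a one-axis overlap test. The sketch assumes that boxes are `(x0, y0, x1, y1)` tuples and that "vertically adjacent" means sharing some vertical extent with the mark; `vertical_overlap` and `margin_context` are hypothetical names.

```python
def vertical_overlap(a, b):
    """True if two (x0, y0, x1, y1) boxes share any vertical extent."""
    return a[1] <= b[3] and b[1] <= a[3]


def margin_context(mark_box, segments):
    """Context for a margin bar or margin note.

    Horizontal distance is ignored: every segment that is vertically
    adjacent to the mark is included, however far away it lies sideways.
    """
    return [text for text, box in segments if vertical_overlap(mark_box, box)]


segments = [
    ("paragraph A", (60, 0, 400, 40)),
    ("paragraph B", (60, 50, 400, 90)),
    ("paragraph C", (60, 100, 400, 140)),
]
margin_bar = (10, 45, 14, 95)  # drawn in the left margin beside paragraph B

print(margin_context(margin_bar, segments))  # ['paragraph B']
```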
[0025] After the context of each annotation has been determined, the control routine proceeds to step S140, where the control routine assigns attributes to the annotations in at least one of three ways: 1) attributes entered by the user; 2) attributes inherited from the document's attributes; and 3) implicit or explicit attributes derived from the annotations themselves.
[0026] The user may enter attributes by interacting with a dialog box, by selecting from a marking menu, or by selecting a special pen. Example attributes derived from the annotations themselves include "agree", "disagree", "good idea", and "follow-up". In addition, annotation gestures such as "exclamation point" and "question mark" may be interpreted to mean "good idea" and "questionable" by the system as they are entered on the page. Attributes may also be entered implicitly, the most important of which are the date and time that the annotation was made and the page number of the annotation. Another implicit attribute is the type of annotation, e.g., highlight, circle, marginal note, etc.
[0027] Attributes may also be inferred from documents. In the system 10, the electronic documents are already associated with a variety of attributes, such as creation date, author, provenance and title.
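The three sources of attributes used at step S140 can be combined as in the sketch below. It assumes that annotations and documents are plain dictionaries; the key names, the `GESTURE_ATTRIBUTES` table and the function `assign_attributes` are hypothetical, and only the two gesture interpretations named in the description are included.

```python
# Hypothetical gesture-to-attribute table built from the two examples above.
GESTURE_ATTRIBUTES = {
    "exclamation point": "good idea",
    "question mark": "questionable",
}


def assign_attributes(annotation, document, user_labels=()):
    """Collect attributes for one annotation from the three sources."""
    attributes = {}

    # 1) Attributes entered by the user (dialog box, marking menu, special pen).
    labels = list(user_labels)

    # 2) Attributes inherited from the document's own attributes.
    for key in ("creation_date", "author", "provenance", "title"):
        if key in document:
            attributes["document_" + key] = document[key]

    # 3) Attributes derived from the annotation itself: the implicit ones
    #    (time, page, type) and any meaning attached to a recognized gesture.
    attributes["made_at"] = annotation["made_at"]
    attributes["page"] = annotation["page"]
    attributes["type"] = annotation["type"]
    derived = GESTURE_ATTRIBUTES.get(annotation.get("gesture"))
    if derived is not None:
        labels.append(derived)

    attributes["labels"] = labels
    return attributes


annotation = {"made_at": "1997-09-15T10:30", "page": 3,
              "type": "margin note", "gesture": "question mark"}
document = {"author": "A. Author", "title": "Sample Report"}

attrs = assign_attributes(annotation, document, user_labels=["follow-up"])
print(attrs["labels"])          # ['follow-up', 'questionable']
print(attrs["document_title"])  # Sample Report
```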
[0028] After the attributes are assigned to the annotation at step S140, the control routine proceeds to step S150, where the annotations are organized, ordered or ranked by the assigned attributes. Subsequently, the control routine proceeds to step S160, where the annotations are displayed for the user. The control routine then proceeds to step S170, where the control routine stops.

[0029] The system 10 visually presents the annotations in context using different list views. Lists are ordered or filtered by the attributes described above. The system 10 allows the reader to navigate between these views and the underlying electronic documents. Examples of ordered lists include:

1) Ordered by time. This view is analogous to a reader's notebook, but also automatically includes the context of each annotation, as shown in Fig. 4, without further effort by the user.

2) Filtered by attribute. Passages across a number of documents are listed in one view;

3) Filtered by the type of adjacent material. For example, annotations of pictures along with the pictures themselves; and

4) Filtered by the content of adjacent material. For example, annotated passages mentioning patent leather shoes are ranked in relatedness using known information retrieval techniques.
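The four list views above can be sketched over a small in-memory collection. The sketch assumes each annotation is a dictionary carrying its attributes and context text; all function and key names are hypothetical, and the term-overlap score in `ranked_by_content` is a deliberately simple stand-in for the "known information retrieval techniques" the description leaves unspecified.

```python
def ordered_by_time(annotations):
    """View 1: notebook-like list, oldest annotation first."""
    return sorted(annotations, key=lambda a: a["made_at"])


def filtered_by_attribute(annotations, label):
    """View 2: annotations carrying a given attribute, across documents."""
    return [a for a in annotations if label in a["labels"]]


def filtered_by_adjacent_type(annotations, kind):
    """View 3: annotations whose adjacent material is of a given type."""
    return [a for a in annotations if a["context_type"] == kind]


def ranked_by_content(annotations, query):
    """View 4: rank by the number of query terms found in the context."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(a["context"].lower().split())), a)
              for a in annotations]
    matches = [(score, a) for score, a in scored if score > 0]
    return [a for score, a in sorted(matches, key=lambda p: p[0], reverse=True)]


annotations = [
    {"id": 1, "made_at": 30, "labels": ["agree"], "context_type": "text",
     "context": "patent leather shoes were in fashion"},
    {"id": 2, "made_at": 10, "labels": ["questionable"], "context_type": "picture",
     "context": "figure of a shoe factory"},
    {"id": 3, "made_at": 20, "labels": ["agree", "follow-up"], "context_type": "text",
     "context": "leather prices rose sharply"},
]

print([a["id"] for a in ordered_by_time(annotations)])                            # [2, 3, 1]
print([a["id"] for a in filtered_by_attribute(annotations, "agree")])             # [1, 3]
print([a["id"] for a in filtered_by_adjacent_type(annotations, "picture")])       # [2]
print([a["id"] for a in ranked_by_content(annotations, "patent leather shoes")])  # [1, 3]
```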

[0030] It is to be understood that the term annotation as used herein is intended to include text, digital ink, audio, video or any other input associated with a document. It is also to be understood that the term document is intended to include text, video, audio and any other media and any combination of media. Further, it is to be understood that the term text is intended to include text, digital ink, audio, video or any other content of a document, including the document's structure.



Claims 

1. A method to display a collection of annotations from at least one document, comprising:

extracting at least one annotation from the at least one document;
extracting a context portion for each at least one annotation from a corresponding one of the at least one document; and
assigning at least one attribute to each at least one extracted annotation.

2. The method of claim 1, wherein the step of assigning at least one attribute comprises assigning at least one user-defined attribute to each at least one annotation.

3. The method of claim 1, wherein the step of assigning at least one attribute comprises assigning at least one document-based attribute to each at least one annotation.

4. The method of any one of claims 1 to 3, wherein the step of extracting at least one annotation comprises:

grouping marks by time and space into at least one collection, wherein each collection forms an annotation;
segmenting the at least one document into a plurality of segments based on the at least one annotation;
determining a minimum context portion for each annotation, from the segments; and
determining the context portion based on the segments surrounding the minimum context.

5. The method of any one of claims 1 to 4, further comprising:

ordering the at least one annotation based on the at least one attribute; and
displaying an ordered list of the at least one annotation along with the corresponding context.

6. An apparatus for displaying annotations from at least one document, the apparatus comprising:

a memory that stores the at least one document;
a processor that extracts at least one annotation and a context portion of the at least one document corresponding to each at least one annotation, that assigns at least one attribute to each at least one extracted annotation, and that orders the at least one annotation based on the at least one assigned attribute; and
a display that displays an ordered list of the at least one annotation and the at least one corresponding context portion.

7. The apparatus of claim 6, wherein, for each at least one annotation, the processor assigns the at least one attribute to that annotation based on at least one document-based attribute.

8. The apparatus of claim 6, wherein, for each at least one annotation, the processor assigns the at least one attribute to that annotation based on at least one attribute derived from the context portion corresponding to that annotation.

9. The apparatus of claim 6, wherein when one of the at least one annotation is a mark annotation the processor assigns the at least one attribute based upon that mark annotation.

10. The apparatus of any one of claims 6 to 9, wherein:

the processor extracts the at least one annotation by grouping at least one user generated mark on the at least one document based on the time of the at least one mark and a location within the document of the at least one mark into at least one collection, wherein each collection forms a single annotation, the processor determines a minimum context portion of the corresponding document for each annotation; and
determines the context for each annotation based on the annotation and the minimum context.